Return-Path: <icon-group-sender>
Received: from kingfisher.CS.Arizona.EDU (kingfisher.CS.Arizona.EDU [192.12.69.239])
	by baskerville.CS.Arizona.EDU (8.9.1a/8.9.1) with SMTP id QAA04917
	for <icon-group-addresses@baskerville.CS.Arizona.EDU>; Thu, 10 Sep 1998 16:54:52 -0700 (MST)
Received: by kingfisher.CS.Arizona.EDU (5.65v4.0/1.1.8.2/08Nov94-0446PM)
	id AA31445; Thu, 10 Sep 1998 16:54:25 -0700
To: icon-group@optima.CS.Arizona.EDU
Date: 10 Sep 1998 20:00:24 GMT
From: jeffery@cs.utsa.edu (Clinton Jeffery)
Message-Id: <6t9b4o$8rs$1@ringer.cs.utsa.edu>
Organization: The University of Texas at San Antonio
Sender: icon-group-request@optima.CS.Arizona.EDU
References: <35F723CF.76B3CC97@Japan.NCR.COM>
Reply-To: jeffery@cs.utsa.edu
Subject: Re: Unicode support or support for non-Ascii based character manipulation?
Errors-To: icon-group-errors@optima.CS.Arizona.EDU
Status: RO

Eric Hildum (Eric.Hildum@Japan.NCR.COM) wrote (and I paraphrase/edited):
: Icon ... supporting only ASCII makes it less useful for non-English language
: With Unicode... it should be possible to begin including support for
: non-English and non alphabetic languages.
: Has anyone thought about this yet? What does string and pattern matching
: mean in, for example, Japanese?

1. Other folks have been thinking about it, especially Icon users in Asia.
   For example, a Chinese version of Icon has been done by researchers
   in China.

2. Going to Unicode might not be *that* difficult, but I think Unicode
   isn't really as widely adopted as you might suggest. Many people seem
   to be using mixed 8/16-bit strings.

3. The semantics of string and pattern matching are no different in
   Japanese than in English. There is nothing specific to language or
   grammar in the Icon string and pattern matching repertoire. Of course,
   when the character set changes the actual code needs to change...

4. Let's look at the current situation for mixed-character sets.
   I am not sure how Chinese Icon stands on these, but consider plain-old
   Windows Icon. Divide functionality as follows:

   non-alphabetic output: Windows Icon already can do this
   non-alphabetic input: we have known bugs in the input processing of
       these, either in Windows Icon or the IPL "vidgets" code.
   non-alphabetic string scanning: not supported, but could be implemented
       as Icon Program Library procedures.

Even Unicode string semantics could be implemented as library procedures
on top of (even length!) Icon strings. We don't really need much
additional infrastructure. Some folks in the user community could
coordinate the library procedures to do this as an interesting project.
We do also need someone who can compile Icon from its C code and debug
I/O problems on a non-alphabetic platform at this point.

--
Clint Jeffery, jeffery@cs.utsa.edu
Division of Computer Science, The University of Texas at San Antonio
Research http://www.cs.utsa.edu/research/plss.html
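
As a rough illustration of the "library procedures on top of (even length!)
strings" idea in point 4, here is a minimal sketch. It is written in Python
rather than Icon, with Python bytes standing in for an 8-bit Icon string, and
it assumes two bytes per character in big-endian order. The names wencode,
wlen, wchar and wfind are hypothetical; they are not part of Icon or the IPL.

```python
# Illustration only: `bytes` plays the role of an 8-bit Icon string.
# All helper names here are hypothetical, not an existing library.

def wencode(text: str) -> bytes:
    """Pack text as two bytes per character (big-endian 16-bit units)."""
    data = text.encode("utf-16-be")
    if len(data) != 2 * len(text):
        # Assumption of this sketch: every character fits in 16 bits.
        raise ValueError("characters outside the 16-bit range are not handled")
    return data

def wlen(s: bytes) -> int:
    """Length in wide characters; the byte length must be even."""
    if len(s) % 2:
        raise ValueError("wide string must have even length")
    return len(s) // 2

def wchar(s: bytes, i: int) -> bytes:
    """Return the i-th wide character, counting from 1 as Icon does."""
    if not 1 <= i <= wlen(s):
        raise IndexError(i)
    return s[2 * (i - 1): 2 * i]

def wfind(needle: bytes, s: bytes, start: int = 1):
    """1-based wide-character position of the first match, or None.

    Only even byte offsets are tried, so a match can never straddle
    the second byte of one character and the first byte of the next.
    """
    n = wlen(needle)
    for i in range(start, wlen(s) - n + 2):
        if s[2 * (i - 1): 2 * (i - 1 + n)] == needle:
            return i
    return None

text = wencode("日本語")
position = wfind(wencode("本語"), text)  # position counted in characters
```

The aligned search in wfind is the part that a plain byte-level search does
not give you: the bytes 30 42 42 30 contain 42 42 at an odd offset, but no
16-bit character with that value. Everything else is ordinary string
handling with the offsets doubled.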